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Real-space grids and the Octopus code as tools for the development of new simulation 
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Real-space grids are a powerful alternative for the simulation of electronic systems. One of the 
main advantages of the approach is the flexibility and simplicity of working directly in real space 
where the different fields are discretized on a grid, combined with competitive numerical performance 
and great potential for parallelization. These properties constitute a great advantage at the time of 
implementing and testing new physical models. Based on our experience with the Octopus code, in 
this article we discuss how the real-space approach has allowed for the recent development of new 
ideas for the simulation of electronic systems. Among these applications are approaches to calculate 
response properties, modeling of photoemission, optimal control of quantum systems, simulation 
of plasmonic systems, and the exact solution of the Schrodinger equation for low-dimensionality 
systems. 


I. INTRODUCTION 

The development of theoretical methods for the sim¬ 
ulation of electronic systems is an active area of study. 
This interest has been fueled by the success of theoreti¬ 
cal tools like density functional theory (DFT) [Tl[2] that 
can predict many properties with good accuracy at a rel¬ 
atively modest computational cost. On the other hand, 
these same tools are not good enough for many applica¬ 
tions [3], and more accurate and more efficient methods 
are required. 

Gurrent research in the area covers a broad range of 
aspects of electronic structure simulations: the develop¬ 
ment of novel theoretical frameworks, new or improved 
methods to calculate properties within existing theories, 
or even more efficient and scalable algorithms. In most 
cases, this theoretical work requires the development of 
test implementations to assess the properties and predic¬ 
tive power of the new methods. 

The development of methods for the simulations of 
electrons requires continual feedback and iteration be¬ 
tween theory and results from implementation, so the 
translation to code of new theory needs to be easy to im¬ 


plement and to modify. This is a factor that is usually 
not considered when comparing the broad range of meth¬ 
ods and codes used by chemists, physicists and material 
scientists. 

The most popular representations for electronic struc¬ 
ture rely on basis sets that usually have a certain physical 
connection to the system being simulated. In chemistry, 
the method of choice is to use atomic orbitals as a basis 
to describe the orbitals of a molecule. When these atomic 
orbitals are expanded in Gaussian functions, it leads to 
an efficient method as many integrals can be calculated 
from analytic formulae [4]. In condensed-matter physics, 
the traditional basis is a set of plane waves, which corre¬ 
spond to the eigenstates of a homogeneous electron gas. 
These physics-inspired basis sets have, however, some 
limitations. For example, it is not trivial to simulate 
crystalline systems using atomic orbitals [5], and, on the 
other hand, in plane-wave approaches finite systems must 
be approximated as periodic systems using a super-cell 
approach. 

Several alternatives to atomic-orbital and plane-wave 
basis sets exist EHin]. One particular approach that does 
not depend on a basis set uses a grid to directly repre¬ 
sent fields in real-space. The method was pioneered by 
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Becke m, who used a combination of radial grids cen¬ 
tered around each atom. In 1994 Chelikowsky, Troullier 
and Saad [12] presented a practical approach for the so¬ 
lution of the Kohn-Sham (KS) equations using uniform 
grids combined with pseudo-potentials. What made the 
approach competitive was the use of high-order finite dif¬ 
ferences to control the error of the Laplacian without re¬ 
quiring very dense meshes. From that moment, several 
real-space implementations have been presented [T3H33]. 

Discretizing in real-space grids does not benefit from 
a direct physical connection to the system being simu¬ 
lated. However, the method has other advantages. In 
first place, a real-space discretization is, in most cases, 
straight-forward to perform starting from the continuum 
description of the electronic problem. Operations like in¬ 
tegration are directly translated into sums over the grid 
and differential operators can be discretized using finite 
differences. In fact, most electronic structure codes must 
rely on an auxiliary real-space discretization used, for ex¬ 
ample, for the calculation of the exchange and correlation 
(xc) term of DFT. 

Grids are flexible enough to directly simulate different 
kinds of systems: finite, and fully or partially periodic. It 
is also possible to perform simulations with reduced (or 
increased) dimensionality. Additionally, the discretiza¬ 
tion error can be systematically and continuously con¬ 
trolled by adjusting the spacing between mesh points, 
and the physical extension of the grid. 

The simple discretization and flexibility of the real 
space grids makes them an ideal framework to implement, 
develop and test new ideas. Modern electronic structure 
codes are quite complex, which means that researchers 
seldom can write code from scratch, but instead need to 
resort to existing codes to implement their developments. 

From the many codes available, in our experience the 
real-space code Octopus [22l El provides an ideal frame¬ 
work for theory-development work. To illustrate this 
point, in this article we will explore some recent advances 
that have been made in computational electronic struc¬ 
ture and that have been developed using the Octopus 
code as a base. We will pay special attention to the 
most unusual capabilities of the code, and in particular 
to the ones that have not been described in previous ar¬ 
ticles Eainissi. 


II. THE OCTOPUS CODE 

Octopus was started around 2000 in the group of pro¬ 
fessor Angel Rubio who, at that moment, was as the 
University of Valladolid, Spain. The first article using 
Octopus was published in 2001 [36]. Today, the code has 
grown to 200,000 lines of code. Octopus receives con¬ 
tributions from many developers from several countries 
and its results have been used for hundreds of scientific 
publications. 

The original purpose of Octopus was to perform real¬ 
time time-dependent density functional theory (TDDFT) 


calculations, a method that had been recently proposed 
at the time for the calculation of excited-state properties 
in molecules [37]. Beyond this original feature, over the 
time the code has become able to perform many types 
of calculations of ground-state and excited-state prop¬ 
erties. These include most of the standard features of 
a modern electronic-structure package and some not-so- 
common capabilities. 

Among the current capabilities of Octopus are an effi¬ 
cient real-time TDDFT implementation for both finite 
and periodic systems [38] [39]. Some of the research 
presented in this article is based on that feature, such 
as the simulation of photoemission, quantum optimal 
control, and plasmonic systems. The code can also 
perform molecular-dynamics simulations in the Born- 
Oppenheimer and Ehrenfest approximations. It also im¬ 
plements a modified Ehrenfest approach for adiabatic 
molecular dynamics gni EH that has favorable scaling 
for large systems. Octopus can perform linear-response 
TDDET calculations in different frameworks; these im¬ 
plementations are discussed in sections in and [V| Eor 
visualization, analysis and post-processing. Octopus can 
export quantities such as the density, orbitals, the cur¬ 
rent density, or the time-dependent electron localization 
function [42] to different formats, including the required 
DET data to perform GW/Bethe-Salpeter calculations 
with the BerkeleyGW code [43] . 

Octopus is publicly and freely available under the GPL 
free/open-source license, this includes all the releases as 
well as the development version. The code is written us¬ 
ing the principles of object oriented programming. This 
means that the code is quite flexible and modular. It 
provides a full toolkit for code developers to perform the 
operations required for the implementation of new ap¬ 
proaches for electronic-structure calculations. 

In order to control the quality of the package. Octopus 
uses continuous integration tools. The code includes a set 
of tests that checks most of the functionality by verifying 
the calculation results. After a new change is commited 
to the main repository, a set of servers with different con¬ 
figurations compiles the code and runs a series of short 
tests. This setup quickly detects most of the problems 
in a commit, from syntax that a compiler will not ac¬ 
cept, to unexpected changes in the results. Every night 
a more comprehensive set of tests is executed by these 
same servers. The test-suite framework is quite general 
and is also successfully in use for the BerkeleyGW [43] 
and APE [44] codes. 


III. THE STERNHEIMER FORMULATION OF 
LINEAR-RESPONSE 

In textbooks, perturbation theory is formulated in 
terms of sums over states and response functions. These 
are useful theoretical constructions that permit a good 
description and understanding of the underlying physics. 
However, this is not always a good starting point for nu- 
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merical applications, since it involves the calculation of 
a large number of eigenvectors, infinite sums over these 
eigenvectors, and functions that depend on two or more 
spatial variables. 

An interesting approach that avoids the problems men¬ 
tioned above is the formulation of perturbation theory in 
terms of differential equations for the variation of the 
wave-functions. In the literature, this is usually called 
the Sternheimer equation [45] or density functional per¬ 
turbation theory (DFPT) [46]. Although a perturbative 
technique, it avoids the use of empty states, and has a 
favorable scaling with the number of atoms. 

Octopus implements a generalized version of the Stern¬ 
heimer equation that is able to cope with both static 
and dynamic response in and out of resonance m- 
The method is suited for linear and non-linear response; 
higher-order Sternheimer equations can be obtained for 
higher-order variations. For second-order response, how¬ 
ever, we apply the 2n + 1 theorem (also known as 
Wigner’s 2n + 1 rule) [48] |49] to get the coefficients di¬ 
rectly from first-order response variations. 

In the Sternheimer formalism, we consider the response 
to a monochromatic perturbative field XSv{r) cos {ut). 
This perturbation induces a variation in the time- 
dependent KS orbitals, which we denote 5ipn{r^uj). 
These variations allow us to calculate response observ¬ 
ables, for example, the frequency-dependent polarizabil¬ 
ity. 

In order to calculate the variations of the orbitals we 
need to solve a linear equation that only depends on the 
occupied orbitals (atomic units are used throughout) 


We have found that it is not strictly necessary to 
project out the occupied subspace, the crucial part is 
simply to remove the projection of 5ipn (and any 

other states degenerate with it), which is not physically 
meaningful and arises from a phase convention. To fix 
the phase, it is sufficient to apply a minimal projector 
Pn = l- \^rn) {^m\- We optionally use this ap- 

proach to obtain the entire response wavefunction, not 
just the projection in the unoccupied subspace, which 
is needed for obtaining effective masses in k p theory. 
While the full projection can become time-consuming for 
large systems, it saves time overall since it increases the 
condition number of the matrix for the linear solver, and 
thus reduces the number of solver iterations required to 
attain a given precision. 

We also have implemented the Sternheimer formalism 
when non-integer occupations are used, as appropriate 
for metallic systems. In this case weighted projectors are 
added to both sides of eq. 0 [50] . We have generalized 
the equations to the dynamic case m- The modified 
Sternheimer equation is 


H -€n±UJ + iri + '^am \^Pm} {^m\ > ^^nir, ±w) = 


^F,n ^ ; l^n,m Iv^m) 


SH{±co)^r^{r), (4) 


where 


= max (ep — 3(T — e„, 0) 


( 5 ) 


- en±uj+ ir]'^ 5ipn{r,±Lo) = -PcSH{±uj)ipn{r), 

( 1 ) 

where the variation of the time-dependent density, given 

by 

5n{r,uj) = [^n{r)]* 5ipn{r,w)+[5ipn{r,-u:)]* ipn{r)^ 

k 

( 2 ) 

needs to be calculated self-consistently. The first-order 
variation of the KS Hamiltonian is 

5H{uj) = 5v{r) + [ 

J \r-r'\ 

+ j dr'f^cir,r\uj)5n{r\uj) . (3) 

Pc is a projection operator, and r] a positive infinitesi¬ 
mal, essential to obtain the correct position of the poles 
of the causal response function, and consequently to 
obtain the imaginary part of the polarizability and re¬ 
move the divergences of the equation for resonant fre¬ 
quencies. In the usual implementation of DFPT, Pc = 

1 — \^n) {^n\ which effectively removes the com¬ 

ponents of 5ipn{r^Puj) in the subspace of the occupied 
ground-state wave-functions. In linear response, these 
components do not contribute to the variation of the den¬ 
sity. 


/^n,m — ^F,n^n,m T 


^ ^F,n ^n,m ^ 
6n T ^ 


(6) 


a is the broadening energy, and Oij is the smear¬ 
ing scheme’s approximation to the Heaviside function 
B {{ci — €j) /a). Apart from semiconducting smearing 
{i.e. the original equation above, which corresponds to 
the zero temperature limit), the code offers the standard 
Fermi-Dirac [52], Methfessel-Paxton [53], spline [54], and 
cold [55] smearing schemes. Additionally, we have devel¬ 
oped a scheme for handling arbitrary fractional occupa¬ 
tions, which do not have to be defined by a function of 
the energy eigenvalues [51]. 

In order to solve eq. Q we use a self-consistent it¬ 
eration scheme similar to the one used for ground-state 
DFT. In each iteration we need to solve a sparse lin¬ 
ear problem where the operator to invert is the shifted 
KS Hamiltonian. For real wavefunctions and a real shift 
(as for the static case), we can use conjugate gradi¬ 
ents. When the shift is complex, a non-Hermitian it¬ 
erative solver is required. We have found that a robust 
and efficient solver is the quasi-minimal residual (QMR) 
method [56] . 

We can solve for linear response to various different 
perturbations. The most straight-forward case is the re¬ 
sponse of a finite system to an electric field with 
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frequency uj in the direction i, where the perturbation 
operator is 6v = fi m- In this case the polarizability 
can be calculated as 


ai 


H = -E 


/ 1^1 difri \ / difn 1^1 . 


(7) 


The calculation of the polarizability yields optical re¬ 
sponse properties (that can be extended to nonlinear re¬ 
sponse) [471 Eli and, for imaginary frequencies, van der 
Waals coefficients [58] , 

It is also possible to use the formalism to com¬ 
pute vibrational properties for finite and periodic sys¬ 
tems gSUSS]. Currently Octopus implements the calcu¬ 
lations of vibrations for finite systems. In this case the 
perturbation operator is an infinitesimal ionic displace¬ 
ment dHjdRia = dvajORia^ for each direction i and 
atom a. The quantity to calculate is the dynamical ma¬ 
trix, or Hessian, given by 


D 


d‘^E 


ia,j^ — 


dRiadRjp 


_ 7-\ion —ion 


occ 


n ■- 


dvo 


dR 

+<5a/3 ( 


dR 


4/3 


-h c.c. 


d^i 


dRicydR, 


aj 




(8) 


The contribution from the ion-ion interaction energy is 


7-\ion—ion _ 




-ZfyZ/^ 


-3 


(^Ria -Ri'y) 




q {Ricx-Rjfi) 
^ \Rc.-R^\^ 


a = P 

(9) 


where Z^^ is the ionic charge of atom a. We have found 
that an alternative expression for the perturbation oper¬ 
ator yields more accurate results when discretized. This 
is discussed in section IyD 

Vibrational frequencies uj are obtained by solving the 
eigenvalue equation 

, ( 10 ) 

y'rriamiB 

where rria is the mass for ion a. For a finite system of 
N atoms, there should be 3 zero-frequency translational 
modes and 3 zero-frequency rotational modes. However, 
they may appear at positive or imaginary frequencies, 
due to the finite size of the simulation domain, the dis¬ 
cretization of the grid, and finite precision in solution of 
the ground state and Sternheimer equation. Improving 
convergence brings them closer to zero. 

The Born effective charges can be computed from the 
response of the dipole moment to ionic displacement: 


^ d^E ^ d^i 

dSidR^a dRja 


— Za^ij 



d(Pn \ 

dRja / 


( 11 ) 


The intensities for each mode for absorption of radiation 
polarized in direction i, which can be used to predict 
infrared spectra, are calculated by multiplying by the 
normal-mode eigenvector x 


= (12) 

ja 

The Born charges must obey the acoustic sum rule, 
from translational invariance 

Zj ja = Qtot^ij • (13) 

a 


For each ij, we enforce this sum rule by distributing the 
discrepancy equally among the atoms, and thus obtaining 
corrected Born charges: 



(14) 


The discrepancy arises from the same causes as the non¬ 
zero translational and rotational modes. 

The Sternheimer equation can be used in conjunction 
with k • p perturbation theory [60] to obtain band veloci¬ 
ties and effective masses, as well as to apply electric fields 
via the quantum theory of polarization. In this case the 
perturbation is a displacement in the /c-point. Using the 
effective Hamiltonian for the /c-point 

Ffe = e-ik -rr ^ 


the perturbation is represented by the operator 


=-iV+ k + [va,r] , (16) 

including the effect on the non-local pseudopotentials. 
The first-order term gives the band group velocities in a 
periodic system. 


^nk — 

on 



(17) 


Inverse effective mass tensors can be calculated by solv¬ 
ing the Sternheimer equation for the k • p perturbation. 
The equation is not solved self-consistently, since the 
variation of /c-point is not a physical perturbation to the 
system; a converged k-giid should give the same density 
even if displaced slightly. The perturbation dH^/dk is 
purely anti-Hermitian. We use instead —idHk/dk to ob¬ 
tain a Hermitian perturbation, which allows the response 
to real wavefunctions to remain real. The effective mass 
tensors are calculated as follows: 


— 1 ^ ^nk 

rfYl ^ — _ 

dkSk^ 


— ^ij + 


^nk 


dHk 


dki 


d^nk \ 

W/ 


" 1 “ I [^25 ,'Oq,]] I . (IS) 
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The k • p wavefunctions can be used to compute the 
response to electric fields in periodic systems. In finite 
systems, a homogeneous electric field can be represented 
simply via the position operator r. However, this opera¬ 
tor is not well defined in a periodic system and cannot be 
used. According to the quantum theory of polarization, 
the solution is to replace rep with —id(fi/dk |6T], and then 
use this as the perturbation on the right hand side in the 
Sternheimer equation [62]. While this is typically done 
with a finite difference with respect to k gSlES] , we use 
an analytic derivative from a previous k • p Sternheimer 
calculation. Using the results in eq. 0 gives a formula 
for the polarization of the crystal: 


occ 

aij (w) = z y] 

n 


dki dSj^uj 


d^nk 

d£j-uj 



The polarizability is most usefully represented in a peri¬ 
odic system via the dielectric constant 


^ij 


^ij 


Att 


( 20 ) 


the coupling of the electrons to the magnetic field must 
be included in the Hamiltonian 






(23) 


The first part describes the orbital interaction with the 
field, and the second one is the Zeeman term that rep¬ 
resents the coupling of the electronic spin with the mag¬ 
netic field. 

As our main interest is the evaluation of the magnetic 
susceptibility, in the following, we consider a perturba¬ 
tive uniform static magnetic field B applied to a finite 
system with zero total spin. In the Coulomb gauge the 
corresponding vector potential. A, is given as 

A{r) = ^B X r . (24) 

In orders of B the perturbing potentials are 

= ^irxp)i = ^Li , (25) 


where V is the volume of the unit cell. This scheme can 
also be extended to non-linear response. 

We can compute the Born charges from the electric- 
field response in either finite or periodic systems (as 
a complementary approach to using the vibrational re¬ 
sponse): 


^ija 


d^E 

OEidRja 


OCC 

^OL^ij 

n 



dFja 

d£i 


[/ 

dvoc 

\ 1 


dRia 



( 21 ) 


This expression can be evaluated with the same approach 
as for the dynamical matrix elements, and is easily gen¬ 
eralized to non-zero frequency too. We can also make 
the previous expression eq. [^for Born charges from the 
vibrational perturbation usable in a periodic system with 
the replacement np —id(p/dk. 

Unfortunately, the k • p perturbation is not usable to 
calculate the polarization m, and a sum over strings of 
/c-points on a finer grid is required. We have implemented 
the special case of a T-point calculation for a large super¬ 
cell, where the single-point Berry phase can be used [64] . 
For cell sizes Li in each direction, the dipole moment is 
derived from the determinant of a matrix whose basis is 
the occupied KS orbitals: 


lii = -^Im In det ( (pr^ 

‘I'K ' 


^ — 2TTixil Li 




( 22 ) 


IV. MAGNETIC RESPONSE AND GAUGE 
INVARIANCE IN REAL-SPACE GRIDS 


with L the angular momentum operator, and 

- riTj) ■ (26) 

The induced magnetic moment can be expanded in 
terms of the external magnetic field which, to first or¬ 
der, reads 


mi = m° + J2xijBp\ (27) 

3 

where % is the magnetic susceptibility tensor. For finite 
systems the permanent magnetic moment can be calcu¬ 
lated directly from the ground-state wave-functions as 

m° = . (28) 


For the susceptibility, we need to calculate the first- 
order response functions in the presence of a magnetic 
field. This can be done in practice by using the mag¬ 
netic perturbation, eq. (|2^ , in the Sternheimer formal¬ 
ism described in section [Hlf If the system is time-reversal 
symmetric, since the perturbation is anti-symmetric un¬ 
der time-reversal (anti-Hermitian), it does not induce a 
change in the density and the Sternheimer equation does 
not need to be solved self-consistently. From there we 
find 

n 

(29) 


In the presence of a magnetic field B{r^t), generated 
by a vector potential A{r,t), additional terms describing 


Before applying this formalism in a calculation, however, 
we must make sure that our calculation is gauge invari¬ 
ant. 





















6 


In numerical implementations, the gauge freedom in 
choosing the vector potential might lead to poor con¬ 
vergence with the quality of the discretization, and to a 
dependence of the magnetic response on the origin of the 
simulation cell. In other words, an arbitrary translation 
of the molecule could introduce an nonphysical change in 
the calculated observables. This broken gauge-invariance 
is well known in molecular calculations with all-electron 
methods that make use of localized basis sets. In this 
case, the error can be traced to the finite-basis-set repre¬ 
sentation of the wave-functions [65l |66] . A simple mea¬ 
sure of the error is to check for the fulfillment of the 
hyper-virial relation m- 

i{^j\p\v>n) = (e„ - €j){(pj\r\(pn) , (30) 


where is the eigenvalue of the state ipn- 

When working with a real-space mesh, this problem 
also appears, though it is milder, because the standard 
operator representation in the grid is not gauge-invariant. 
In this case the error can be controlled by reducing the 
spacing of the mesh. On the other hand, real-space grids 
usually require the use of the pseudo-potential approx¬ 
imation, where the electron-ion interaction is described 
by a non-local potential v^\. This, or any other non¬ 
local potential, introduces a fundamental problem when 
describing the interaction with magnetic fields or vector 
potentials in general. To preserve gauge invariance, this 
term must be adequately coupled to the external electro¬ 
magnetic field, otherwise the results will strongly depend 
on the origin of the gauge. For example, an extra term 
has to be included in the hyper-virial expression, eq. (30), 
resulting in 


'^{^j\p\^n) — (^n (^j l^l^n) + I [“^5 f^nl] l^n) • (31) 

In general, the gauge-invariant non-local potential is 
given by 


{r\vni\r') = 


(r|{)ni|r')exp x,t) ■ 


da; 


(32) 


The integration path can be any one that connects the 
two points r and r', so an infinite number of choices is 
possible. 

In order to calculate the corrections required to the 
magnetic perturbation operators, we use two different 
integration paths that have been suggested in the lit¬ 
erature. The first was proposed by Ismail-Beigi, Chang, 
and Louie (ICL) [68] who give the following correction to 
the first-order magnetic perturbation term 

^ ( 33 ) 

and a similar term for the second-order perturbation. Us¬ 
ing a different integration path, Pickard and Mauri [69| 
proposed the GIPAW method, that has the form 

J^GIPAW ^ ^^mag _ ^ ^ ^ [r, {)“i] , (34) 


X 

B 20 -250.2 
B 38 -468.3 
B 44 -614.4 
Bso 219.3 
B 92 -831.3 

TABLE 1. Calculated magnetic susceptibilities (y in 
cgs ppm/mol) per number of boron atoms for the selected 
boron clusters shown in Fig. Results from Ref. izu 


where and respectively, the position and non¬ 

local potential of atom a. With the inclusion of either 
one of these methods, both implemented in Octopus, we 
recover gauge invariance in our formalism when pseudo¬ 
potentials are used. This allows us to predict the mag¬ 
netic susceptibility and other properties that depend on 
magnetic observables, like optical activity HQ]. 

A class of systems with interesting magnetic suscepti¬ 
bilities are fullerenes. For example, it is known that the 
Ceo fnllerene has a very small magnetic susceptibility due 
to the cancellation of the paramagnetic and diamagnetic 
responses HIIITII. Botti et al. HSj used the real-space im¬ 
plementation of Octopus to study the magnetic response 
of the boron fullerenes depicted in Fig. As shown in 
table [T| they found that, while most clusters are diamag¬ 
netic, Bgo is paramagnetic, with a strong cancellation of 
the paramagnetic and diamagnetic terms. 





FIG. 1. Structures of boron cages whose magnetic suscepti¬ 
bilities are given in table ^ 


V. LINEAR RESPONSE IN THE 
ELECTRON-HOLE BASIS 


An alternate approach to linear response is not to solve 
for the response function but rather for its poles (the 
excitation energies ujk) and residues {e.g. electric dipole 
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matrix elements dk) m- The polarizability is given by 

• dk^ • dj^^ • djJ^ • d]^^ 

ujk — 00 — iS cj/c + cj + 

(35) 

and the absorption cross-section is 

aij (oj) = - a Im aij (oo) , (36) 

where a is the fine-structure constant. The simplest ap¬ 
proximation to use is the random-phase approximation 
(RPA), in which the excitation energies are given by the 
differences of unoccupied and occupied KS eigenvalues, 
^cv = ec — ey. The corresponding dipole matrix elements 
are dev = (^c |^| [IS]. (As implemented in the code, 

this section will refer only to the case of a system without 
partially occupied levels.) 

The RPA is not a very satisfactory approximation, 
however. The full solution within TDDFT is given by 
a non-Hermitian matrix eigenvalue equation, with a ba¬ 
sis consisting of both occupied-unoccupied {v c) and 
unoccupied-occupied (c ^ v) KS transitions. The equa¬ 
tion reads as 


neglected and thus we need only consider the occupied- 
unoccupied transitions. The matrix equation is reduced 
to a Hermitian problem of half the size of the full prob¬ 
lem: 

Ax = oox . (40) 

Interestingly, the Tamm-Dancoff approximation is often 
found to give superior results to the full solution, for ex¬ 
ample for molecular potential-energy surfaces or when 
hybrid functionals are used, which can suffer from a 
“triplet instability” in which the lowest triplet state is 
lower in energy than the ground state m- The dipole 
matrix elements are now a superposition of the KS ones: 

dk — ^ ^ dev Xqv • ('^^) 

cv 

When the wavefunctions are real, the full problem can 
be collapsed into a Hermitian one of the same size as 
the Tamm-Dancoff matrix, known as Casida’s equation 

izaiHo]. 

(^c ^v) ^cc'^vv' ff ^c' ^v' ^cve'v' '\/^ ^cv • 

(42) 

The dipole matrix elements are 


i‘^) = 

k 



X = UJX 


(37) 



(43) 


where the A matrices couple v ^ c transitions among 
themselves and c ^ v among themselves, while the B 
matrices couple the two types of transitions. They have 
the form m 

I (}Pv' I ^ \^c) \^v) — (^c ^v) ^cc'^vv' (^^) 

+ I {^v' I + /xc l^c) \^v) 5 

(*^c' I I B \Pc) \Pv) — {Pc' I {Pv' I '5c + /xc \Pc) \Pv) • 

(39) 

where Vc is the Coulomb kernel, and /xc is the exchange- 
correlation kernel (currently only supported for LDA- 
type functionals in Octopus). 

We do not solve the full equation in Octopus, but pro¬ 
vide a hierarchy of approximations. An example calcula¬ 
tion for the N 2 molecule with each theory level is shown 
in Table [ll| The lowest approximation we use is RPA. The 
next is the single-pole approximation of Petersilka et al 
m, in which only the diagonal elements of the matrix 
are considered. Like in the RPA case, the eigenvectors 
and dipole matrix elements are simply the KS transitions. 
The positive eigenvalues are ojev = Cc — + Aevcv This 

can be a reasonable approximation when there is little 
mixing between KS transitions, but generally fails when 
there are degenerate or nearly degenerate transitions. 

A next level of approximation is the Tamm-Dancoff 
approximation to TDDFT m in which the B blocks are 


An alternate approach for finding excitation energies 
is to look for many-body eigenstates of the DFT Hamil¬ 
tonian which are orthogonal to the ground state. In the 
“second-order constrained variational” or CV(2) theory 
jST] . second-order perturbation theory from the ground- 
state density yields equations quite similar to the linear- 
response approach, despite their different origin: 


A 

B 


-A 


We implement the case of real wavefunctions and eigen¬ 
vectors, in which case (as for Casida’s equation) a Her¬ 
mitian matrix equation for only the occupied-unoccupied 
transitions can be written: 

{Ap B)x = oox . (45) 

The Tamm-Dancoff approximation to these equations is 
identical to the ordinary TDDFT Tamm-Dancoff approx¬ 
imation. 

Note that all the levels of theory we have discussed 
use the same Coulomb and /xc matrix elements, so the 
code can calculate the results for multiple levels of theory 
with a small extra effort. We can also consider alterna¬ 
tive perturbations in this framework beyond the dipole 
approximation for properties such as inelastic X-ray scat¬ 
tering [82]. 


















RPA Petersilka TDA 

Casida CV(2) 

Exp’t 

8.234 

9.421 

9.343 

9.254 

9.671 

9.309 

8.234 

9.421 

9.343 

9.254 

10.279 

9.309 

9.671 

9.671 

9.671 

9.671 

10.279 

9.921 

9.671 

10.241 

10.237 

10.221 

10.792 

10.270 

9.671 

10.245 

10.241 

10.224 

10.801 

10.270 

9.671 

11.028 

10.931 

10.921 

11.077 

12.199 


TABLE IL The first 6 excitation energies (in eV) for the 
N 2 molecule with different approximations to TDDFT in the 
electron-hole basis: the random phase approximation (RPA), 
Petersilka, Tamm-Dancoff approximation (TDA), Casida and 
CV(2). The VWN LDA parametrization [83] was used for the 
exchange-correlation functional, the bond length is 1.098 A, 
the real-space grid was a sphere of radius 7.4 A with spacing 
0.16 A, and 16 unoccupied states were used. The experimen¬ 
tal data is from Ref. [84] . 


For a non-spin-polarized system, the excitations sep¬ 
arate into a singlet and a triplet subspace, which are 
superpositions of singlet and triplet KS transitions: 



(46) 


(47) 


The signs are reversed from the situation for a simple pair 
of electrons, since we are instead dealing with an electron 
and a hole. There are of course two other triplet excita¬ 
tions {m = ±1) which are degenerate with the m = 0 one 
above. Rather than performing spin-polarized ground- 
state and linear-response calculations, we can use the 
symmetry between the spins in a non-spin-polarized sys¬ 
tem to derive a form of the kernel to use in obtaining 
singlet and triplet excitations m 

+ Lc = {^\vc + \^p} 

= (</?|'0c + 2/xc |</^) (48) 

I + /xc = (^1 fll - B b) • (49) 

These kernels can be used in any of the levels of the¬ 
ory above: RPA, Petersilka, Tamm-Dancoff, Casida, and 
CV(2). The corresponding electric dipole matrix ele¬ 
ments are as in the spin-polarized case for singlet exci¬ 
tations. For triplet excitations, they are identically zero, 
and only higher-order electromagnetic processes can ex¬ 
cite them. 

There are three main steps in the calculation: calcu¬ 
lation of the matrix, diagonalization of the matrix, and 
calculation of the dipole matrix elements. The first step 
generally takes almost all the computation time, and is 
the most important to optimize. Within that step, the 
Coulomb part (since it is non-local) is much more time- 
consuming than the /xc part. We calculate it by solving 
the Poisson equation (as for the Hartree potential) for 
each column of the matrix, to obtain a potential P for 


the density (pc (r)* (fy (r), and then for each row comput¬ 
ing the matrix element as 


{ipc'ipv'\v\ipcipv) = dr(pc'{r)<fy,{r)P[(pc<fv] ■ (50) 


Our basic parallelization strategy for computation of 
the matrix elements is by domains, as discussed in sec¬ 
tion |XV| but we add an additional level of parallelization 
here over occupied-unoccupied pairs. We distribute the 
columns of the matrix, and do not distribute the rows, 
to avoid duplication of Poisson solves. We can reduce 
the number of matrix elements to be computed by al¬ 
most half using the Hermitian nature of the matrix, i.e. 
Mcv.c'v' = ^c'v' cv' there are N occupied-unoccupied 
pairs, there are N diagonal matrix elements, and the 
N {N — 1) /2 remaining off-diagonal matrix elements are 
distributed as evenly as possible among the columns. If 
N — 1 is even, there are {N — 1) /2 for each column; if 
N — 1 is odd, half of the columns have N/2 — 1 and half 
have N/2. See Fig. for examples of the distribution. 
The columns then are assigned to the available proces¬ 
sors in a round-robin fashion. The diagonalization step is 
performed by direct diagonalization with LAPACK [85| 
in serial; since it generally accounts for only a small part 
of the computation time, parallelization of this step is not 
very important. The final step is calculation of the dipole 
matrix elements, which amounts to only a small part of 
the computation time, and uses only domain paralleliza¬ 
tion. Note that the triplet kernel lacks the Coulomb term, 
and so is considerably faster to compute. 

Using the result of a calculation of excited states by 
one of these methods, and a previous calculation of vi¬ 
brational modes with the Sternheimer equation, we can 
compute forces in each excited state, which can be used 
for excited-state structural relaxation or molecular dy¬ 
namics [86] . Our formulation allows us to do this without 
introducing any extra summations over empty states, un¬ 
like previous force implementations [87H89] . The energy 
of a given excited state k is a sum of the ground-state 
energy and the excitation energy: = Eq ^ uJk- The 

force is then given by the ground-state force, minus the 
derivative of the excitation energy: 


dEk _ dcok 

op. fir?. 


(51) 


Using the Hellman-Feynman Theorem we find the last 
term without introducing any additional sums over un¬ 
occupied states. In the particular case of the Tamm- 
Dancoff approximation we have 


diOk 

dRia 



dA 

dRia 



( 52 ) 
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and 






dA 


dRio 


) = \ ‘Vc 


dH 


dRin 




dH 

\ 

\x , / 


dRia 

Pv' 

) dec' + ( 

K 


iff./ (py! 


(53) 


Analogous equations apply for the difference of eigenval¬ 
ues, Petersilka, and CV(2) theory levels. (The slightly 
more complicated Casida case has not yet been imple¬ 
mented.) The Coulomb term, with no explicit depen¬ 
dence on the atomic positions, does not appear, leading 
to a significant savings in computational time compared 
to the calculation of the excited states. 


6x6 


7x7 



FIG. 2. Distribution of matrix elements to be calculated 
among the columns, using Hermiticity of the response ma¬ 
trix. The columns are then distributed among the available 
MPI groups for electron-hole parallelization. The number of 
matrix elements to be calculated per column is equal for an 
odd size, and uneven for an even size. 


VI. FORCES AND GEOMETRY 
OPTIMIZATION ON REAL-SPACE GRIDS 


A function represented on a real-space grid is not in¬ 
variant under translations as one would expect from a 
physical system. The potential of an atom sitting on top 
of a grid point might be slightly different from the po¬ 
tential of the same atom located between points. This 
implies that a rigid displacement of the system produces 
an artificial variation of the energy and other properties. 
If we plot the energy of the atom as a function of this 
rigid displacement, the energy shows an oscillation that 
gives this phenomenon the name of the “egg-box effect”. 

The egg-box effect is particularly problematic for cal¬ 
culations where the atoms are allowed to move, for ex¬ 
ample to study the dynamics of the atoms (molecular 
dynamics) or to find the minimum energy configuration 
(geometry optimization). 


In Octopus we have studied several schemes to control 
the egg-box effect m- The first step is to use pseudo¬ 
potential filtering to eliminate Fourier components of the 
potential that cannot be represented on the grid m- 
Additionally, we have found a formulation for the 
forces that reduces the spurious effect of the grid on the 
calculations. One term in the forces is the expectation 
value of the derivative of the ionic potential with respect 
to the ionic position which can be evaluated as 


IP — IP 

rv r 


non —ion 



dvcx 

dRa 



(54) 


(For simplicity, we consider only local potentials here, 
but the results are valid for non-local potentials as well.) 
This term can be rewritten such that it does not include 
the derivative of the ionic potential Va, but the gradi¬ 
ent of the orbitals with respect to the electronic coordi¬ 
nates [9^ : 


jp — jp 


ion—ion 


+ E 


n 



Ta I 


-h c.c. 


(55) 


The first advantage of this formulation is that it is easier 
to implement than eq. (54), as it does not require the 
derivatives of the potential, which can be quite complex 
and difficult to code, especially when relativistic correc¬ 
tions are included. However, the main benefit of using 
eq. is that it is more precise when discretized on a 
grid, as the orbitals are smoother than the ionic poten¬ 
tial. We illustrate this point in Fig. where the forces 
obtained with the two methods are compared. While 
taking the derivative of the atomic potential gives forces 
with a considerable oscillation due to the grid, using the 
derivative of the orbitals gives a force that is considerably 
smoother. 



J_^^^^_L_ 

2 2.5 3 3.5 

Interatomic distance [a.u.] 


FIG. 3. Galculation of the interatomic force for N 2 . Solid 
(red) line: force calculated from the derivative of the ionic po¬ 
tential with respect to the atomic position. Segmented (blue) 
line: force calculated from spatial derivatives of the molecular 
orbitals. Grid spacing of 0.43 Bohr. 
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This alternative formulation of the forces can be ex¬ 
tended to obtain the second-order derivatives of the en¬ 
ergy with respect to the atomic displacements [90] , which 
are required to calculate vibrational properties as dis¬ 
cussed in section [TTT| In general, the perturbation opera¬ 
tor associated with an ionic displacement can be written 
as 


Using this expression, the terms of the dynamical matrix, 
eq. ([^, are evaluated as 




dVa 

dpn \ 


dRia 

dRjJ 



+ 

/ d(pn 
\ dvi 


\Va\ 

^ I 


dRn 


dRjpdri 

+ c.c. , (57) 


d/3 


and 




d'^Va 

\ \ 

^Ria^Rja 

) — 


dvidrj 

_L I.', I 


Ta I 


+ C.C. 


(58) 


With our approach, the forces tend to converge faster 
with the grid spacing than the energy. This means that 
to perform geometry optimizations it would be ideal to 
have a local minimization method that only relies on the 
forces, without needing to evaluate the energy, as both 
values will not be entirely consistent. Such a method is 
the fast inertial relaxation engine (FIRE) algorithm, put 
forward by Bitzek et al [93]. FIRE has shown a com¬ 
petitive performance compared with both the standard 
conjugate-gradient method, and more sophisticated vari¬ 
ations typically used in ab initio calculations. A recent 
article shows also the FIRE as one of the most conve¬ 
nient algorithm due to its speed and precision to reach 
the nearest local minimum starting from a given initial 
configuration [94] . 

The FIRE algorithm is based on molecular dynamics 
with additional velocity modifications and adaptive time 
steps which only requires first derivatives of the target 
function. In the FIRE algorithm, the system slides down 
the potential-energy surface, gathering “momentum” un¬ 
til the direction of the gradient changes, at which point it 
stops, resets the adaptive parameters, and resumes slid¬ 
ing. This gain of momentum is done through the modi¬ 
fication of the time step At as adaptive parameter, and 
by introducing the following velocity modification 

v(t) ^ V(t) = (1 - (a)v(t) + a |v(t)| F(t) , (59) 


where v is the velocity of the atoms, a is an adaptive 
parameter, and F is a unitary vector in the direction of 
the force F. By doing this velocity modification, the 


acceleration of the atoms is given by 


v(i) 


m 

m 


a 


— |v(t)| \Y{t)-F{t) 


(60) 


where the second term is an introduced acceleration in 
a direction “steeper” than the usual direction of mo¬ 
tion. Obviously, if a = 0 then V(t) = v(t), meaning 
the velocity modification vanishes, and the acceleration 
v(t) = F{t)/m, as usual. 

We illustrate how the algorithm works with a simple 
case: the geometry optimization of a methane molecule. 
The input geometry consists of one carbon atom at the 
center of a tetrahedron, and four hydrogen atoms at the 
vertices, where the initial C-H distance is 1.2 A. In Fig.[^ 
we plot the energy difference AFtot with respect to the 
equilibrium conformation, the maximum component of 
the force acting on the ions F^ax, and the C-H bond 
length. On the first iterations, the geometry approaches 
the equilibrium position, but moves away on the 3rd. 
This means a change in the direction of the gradient, so 
there is no movement in the 4th iteration, the adaptive 
parameters are reset, and sliding resumes in the 5th it¬ 
eration. 



FIG. 4. Geometry optimization of a methane molecule with 
FIRE. Top panel (orange squares): energy difference AFtot 
with respect to the equilibrium geometry. Middle panel (blue 
circles): maximum component of the force Fmax acting on 
the ions. Bottom panel (green diamonds): G-H bond length. 
Grid spacing is 0.33 Bohr. 


VII. PHOTOEMISSION 


Electron photoemission embraces all the processes 
where an atom, a molecule or a surface is ionized under 
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the effect of an external electromagnetic field. In experi¬ 
ments, the ejected electrons are measured with detectors 
that are capable of characterizing their kinetic proper¬ 
ties. Energy-resolved, P{E), and momentum-resolved, 
P(/c), photoemission probabilities are quite interesting 
observables since they carry important information, for 
instance, on the parent ion [95j |96] or on the ionization 
process itself m- The calculation of these quantities is a 
difficult task because the process requires the evaluation 
of the total wavefunction in an extremely large portion 
of space (in principle a macroscopic one) that would be 
impractical to represent in real space. 

We have developed a scheme to calculate photoe¬ 
mission based on real-time TDDFT that is currently 
implemented in Octopus. We use a mixed real- and 
momentum-space approach. Each KS orbital is propa¬ 
gated in real space on a restricted simulation box, and 
then matched at the boundary with a momentum-space 
representation. 

The matching is made with the help of a mask func¬ 
tion M(r), like the one shown in Fig. that separates 
each orbital into a bounded and an unbounded 

component as follows: 

(l)i{r,t) = M(r)0i(r,t) + [1 - M(r)] ^^(r, t) . (61) 

"-V-" -V-" 

Starting from a set of orbitals localized in A at t = 0 
it is possible to derive a time-propagation scheme with 
time step At by recursively applying the discrete time- 
evolution operator U(At) = U (t +At, t) and splitting the 
components with eq. ( [M] ). The result can be written in 
a closed form for 0^(r7t), represented in real space, and 
(tu,t), in momentum space, with the following struc¬ 
ture: 


(j)f{r,t + At) =(pf{r,t + At) + (pf{r,t + At) , 
(pf{k,t + At) = 'df{k,t + At) + 'df{k,t + At) , 

and the additional set of equations, 

+ At) = MU{At)(j)f{r, t) , (63) 

^fiT,t +At) = j dfce‘'“'^?7v(Ai)^f(fc,t) , 

(64) 

+ At) = 

^21^/2 j - M)U{At)4>f{r,t) , (65) 

+ At) = lJ^{At)(j)f {k, t) 

~ 12^ I d^e-‘'“-Vf(r,i +Ai) . (66) 

The momentum-resolved photoelectron probability is 
then obtained directly from the momentum components 
as [98] 


p{k) 


N 

l_im y]|(/)f(fe,i)|2, 


i 


(67) 




FIG. 5. Scheme illustrating the mask method for the calcu¬ 
lation of electron photoemission. A mask function (a) is used 
to effectively split each Kohn-Sham orbital into bounded and 
unbounded components localized in different spatial regions 
A and B according to the diagram in (b). In A the states 
are represented on a real-space grid while in B they are de¬ 
scribed in momentum space. A striped region indicates the 
volume where the two representations overlap. The propaga¬ 
tion scheme of eqs. (62) and ([^ allows seamless transitions 


from one representations to the other and is capable to de¬ 
scribe electrons following closed trajectories like the one in 
(b). 


while the energy-resolved probability follows by direct 
integration, P{E) = 

In eq. ( [^ we introduced the Volkov propagator 
Uy{At) for the wavefunctions in B. It is the time- 
evolution operator associated with the Hamiltonian 
describing free electrons in an oscillating field. Given 
a time dependent vector field A(t), the Hamiltonian 

^ iV — expressed in the velocity gauge 

is diagonal in momentum and can be naturally applied 


to (f)f{k,t). 

For all systems that can be described by a Hamilto¬ 
nian such that H{r^ t) = Hy{r, t) for r G H and all time 
t, eqs. (62) and ( j^ are equivalent to a time propagation 
in the entire space AVJ B. In particular, it exactly de¬ 
scribes situations where the electrons follow trajectories 
crossing the boundary separating A and B as illustrated 
in Fig.[5];b). 

In Octopus we discretize eq. (63) in real and mo men- 
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turn space and co-propagate the complete set of orbitals 
(j)f{r^t) and (j)f{k^t). The propagation has to take care 
of additional details since the discretization can introduce 
numerical instability. In fact, substituting the Fourier in¬ 
tegrals in (63) with Fourier sums (usually evaluated with 
FFTs) imposes periodic boundary conditions that spu¬ 
riously reintroduces charge that was supposed to disap¬ 
pear. This is illustrated with a one-dimensional example 
in Fig. I^a) where a wavepacket launched towards the 
left edge of the simulation box reappears from the other 
edge. 

An alternative discretization strategy is zero padding. 
This is done by embedding the system into a simulation 
box enlarged by a factor a > 1, extending the orbitals 
with zeros in the outer region as shown in Fig. [^b). In 
this way, the periodic boundaries are pushed away from 
the simulation box and the wavepackets have to travel an 
additional distance 2 {a — 1)L before reappearing from 
the other side. In doing so, the computational cost is 
increased by adding {a — l)n points for each orbital. 

This cost can be greatly reduced using a special grid 
with only two additional points placed at TaL as shown 
in Fig. I^c). Since the new grid has non uniform spacing 
a non-equispaced FFT (NFFT) is used j9^ lIQQj . With 
this strategy, a price is paid in momentum space where 
the maximum momentum /Cmax is reduced by a factor a 
compared to ordinary FFT. In Octopus we implemented 
all three strategies: bare FFT, zero padding with FFT 
and zero padding with NFFT. 

All these discretization strategies are numerically sta¬ 
ble for a propagation time approximately equivalent to 
the time that it takes for a wavepacket with the highest 
momentum considered to be reintroduced in the simu¬ 
lation box. For longer times we can employ a modified 
set of equations. It can be derived from (68) under the 
assumption that the electron flow is only outgoing. In 
this case we can drop the equation for ipf responsible for 
the ingoing flow and obtain the set 


At) = MU{At)(t)f{r,t) , 

+ = 0 , 

'df{k,t +At) = Uv{At)(f>f{k,t) . 


M)U{At)(pf{r,t) 


( 68 ) 


This new set of equations together with (62) lifts the pe¬ 


riodic conditions at the boundaries and secures numerical 
stability for arbitrary long time propagations. A conse¬ 
quence of this approximation is the fact that the removal 
of charge is performed only in the equation for (pf by 
means of a multiplication by M{r). This is equivalent 
to the use of a mask function boundary absorber that 
is known to present reflections in an energy range that 
depends on M{r) |IQIj . Carefully choosing the most ap¬ 
propriate mask function thus becomes of key importance 
in order to obtain accurate results. 

We conclude briefly summarizing some of the most im¬ 
portant features and applications of our approach. The 


method allows us to retrieve P(/c), the most resolved 
quantity available in experiments nowadays. In addi¬ 
tion, it is very flexible with respect to the definition of 
the external field and can operate in a wide range of 
situations. In the strong-field regime, it can handle in¬ 
teresting situations, for instance, when the electrons fol¬ 
low trajectories extending beyond the simulation box, or 
when the target system is a large molecule. This consti¬ 
tutes a step forward compared to the standard theoreti¬ 
cal tools employed in the field which, in the large major¬ 
ity of cases, invoke the single-active-electron approxima¬ 
tion. In Ref. [98] the code was successfully employed to 
study the photoelectron angular distributions of nitrogen 
dimers under a strong infrared laser field. The method 
can efficiently describe situations where more than one 
laser pulse is involved. This includes, for instance, time- 
resolved measurements where pump and probe setups are 
employed. In Ref. m Octopus was used to monitor 
the time evolution of the tt ^ tt* transition in ethylene 
molecules with photoelectrons. The study was later ex¬ 
tended to include the effect of moving ions at the classical 
level |IQ3j . Finally, we point out that our method is by 
no means restricted to the study of light-induced ioniza¬ 
tion but can be applied to characterize ionization induced 
by other processes, for example, ionization taking place 
after a proton collision. 


(a) 


(b) 


real space 

A 


FFT 

I 

I 

I 

I 

I 


momentum space 




Ax \ Ak = 27^IL 

I 

FFT - zero padding 

I 

B 


/j^max = nAk/2 



an 

aL I A/c = 2'KlaL fcmax = anAk/2 

(c) 


I 

NFFT - zero padding 

I 



i 

-A^ 

I 
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m < 


— _-• I ‘ 




aL ' AA: = 27r/aL fcmax = (n + 2)Ak/2 


FIG. 6. S chem e illustrating different discretization strate¬ 
gies for eq. ( |63| ) in one dimension. In all the cases an initial 
wavepacket (green) is launched towards the left side of a sim¬ 
ulation box of length L and discretized in n sampling points 
spaced by Ax. A and B indicate the space partitions corre¬ 
sponding to Fig.[^ Owing to the discretization of the Fourier 
integrals, periodic conditions are imposed at the boundaries 
and the wavepacket wraps around the edges of the simula¬ 
tion box (red). The time evolution is portrayed together with 
a momentum-space representation (yellow), with spacing Ak 
and maximum momentum /cmax, in three situations differing 
in the strategy used to map real and momentum spaces: (a) 
Fast Fourier Transform (FFT), (b) FFT extended with zeros 
(zero padding) in a box enlarged by a factor a, and (c) zero 
padding with NFFT. 
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VIII. COMPLEX SCALING AND RESONANCES 

In this section we discuss the calculation of reso¬ 
nant electronic states by means of the complex-scaling 
method, as implemented in Octopus. By “resonant 
states,” we mean metastable electronic states of finite 
systems, such as atoms or molecules, with a characteris¬ 
tic energy and lifetime. 

Mathematically, resonances can be defined as poles of 
the scattering matrix or cross-section at complex ener¬ 
gies [TnnirnH] . If a pole is close to the real energy axis, 
it will produce a large, narrow peak in the cross-section 
of scattered continuum states. One way resonances can 
arise is from application of an electric field strong enough 
to ionize the system through tunnelling. Resonant states 
may temporarily capture incoming electrons or electrons 
excited from bound states, making them important in¬ 
termediate states in many processes. 

The defining characteristic of a resonant state, often 
called a Siegert state m, is that it has an outgoing 
component but not an incoming one. They can be deter¬ 
mined by solving the time-independent Schrodinger equa¬ 
tion with the boundary condition that the wavefunction 
must asymptotically have the form 

gi/cr 

'0(^) ^ r ^ oo , (69) 

r 

where the momentum k is complex and has a negative 
imaginary part. This causes the state to diverge expo¬ 
nentially in space as r ^ oc. The state can further be 
ascribed a complex energy, likewise with a negative imag¬ 
inary part, causing it to decay over time at every point 
in space uniformly. 

Resonant states are not eigenstates of any Hermi- 
tian operator and in particular do not reside within the 
Hilbert space. This precludes their direct calculation 
with the standard computational methods from DFT. 
However, it turns out that a suitably chosen analytic 
continuation of a Siegert state is localized, and this form 
can be used to derive information from the state. This 
is the idea behind the complex-scaling method prmiToT] 
where states and operators are represented by means of 
the transformation 

Rg t/,(r) = , (70) 

where N is the number of spatial dimensions to which the 
scaling operation is applied, and 6> is a fixed scaling angle 
which determines the path in the complex plane along 
which the analytic continuation is taken. The transfor¬ 
mation maps the Hamiltonian to a non-Hermitian oper¬ 
ator He = ReHR-e- 

The Siegert states 'ip{r) of the original Hamiltonian 
are square-integrable eigenstates of and their 

eigenvalues eo — ir/2 define the energy eo and width T of 
the resonance |IQ8I[IIQj . 

A typical example of a spectrum of the transformed 
Hamiltonian Hq is shown in Fig. and the correspond¬ 
ing potential and lowest bound and resonant states in 


Fig-i The bound-state energies are unchanged while 
the continuum rotates by —20 around the origin. Fi¬ 
nally, resonances appear as isolated eigenvalues in the 
fourth quadrant once 0 is sufficiently large to “uncover” 
them from the continuum. Importantly, matrix elements 
(and in particular energies) of states are independent of 
0 as long as the states are localized and well represented 
numerically — this ensures that all physical bound-state 
characteristics of the untransformed Hamiltonian are re¬ 
tained. 

Our implementation supports calculations with com¬ 
plex scaling for independent particles or in combination 
with DFT and selected xc functionals m- The energy 
functional in KS-DFT consists of several terms that are 
all expressible as integrals of the density or the wavefunc- 
tions with the kinetic operator and various potentials. 
The functional is complex-scaled as per the prescribed 
method by rotating the real-space integration contour of 
every term by 0 in the complex plane. The DFT energy 
functional becomes 


E0 = e f ^rifienir) (“^vb V’0n{r) 


- 1 // 


dr dr 


, n g{r)n0{r') 
|r — r'\ 


+ E'^Jjig\ + Jdrve^t{re'^)ng{r) , (71) 

with the now-complex electron density 

Mr) = , (72) 


with occupation numbers /n, and complex-scaled KS 
states (peni'J^)- Note that no complex conjugation is per¬ 
formed on the left component in matrix elements such 
as the density or kinetic energy. In order to define the 
complex-scaled xc potential, it is necessary to perform an 
analytic continuation procedure uni- 

In standard DFT, the KS equations are obtained by 
taking the functional derivative of the energy functional 
with respect to the density. Solving the equations cor¬ 
responds to searching for a stationary point, with the 
idea that this minimizes the energy. In our case, since 
the energy functional is complex-valued ma , we cannot 
minimize the energy functional, but we can still search 
for stationary points to find the resonances ma HU]. 
The complex-scaled version of the KS equations thereby 
becomes similar to the usual ones: 




^en{r) = ipen{r)een • 


(73) 


The effective potential ve{r) is the functional derivative 
of the energy functional with respect to the density ne{r)^ 
and, therefore, consists of the terms 

Mr) = = rliir) + v^^{r) + v^MrM) , (74) 
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where may represent atomic potentials as an¬ 

alytically continued pseudopotentials, and where the 
Hartree potential 


v^(r) = e 


-10 



ne{r') 
\r’ - r| 


(75) 


is determined by solving the Poisson equation defined by 
the complex density. Together with the xc potential, 


zir) 


5El[ne\ 

8n0{r) 


(76) 


this defines a self-consistency cycle very similar to or¬ 
dinary KS DFT, although more care must be taken to 
occupy the correct states, as they are no longer simply 
ordered by energy. 

Fig. H shows calculated ionization rates for the 
He Is state in a uniform St ark-type electric field as a 
function of field strength. In the limit of weak electric 
fields, the simple approximation by Ammosov, Delone 
and Krainov (ADK) |115j . which depends only on the 
ionization potential, approaches the accurate reference 
calculation by Scrinzi and co-workers |116j . This demon¬ 
strates that the ionization rate is determined largely by 
the ionization potential for weak fields. As the local den¬ 
sity approximation is known to produce inaccurate ion¬ 
ization potentials due to its wrong asymptotic form at 
large distances, it necessarily yields inaccurate rates at 
low fields. Meanwhile exact exchange, which is known 
to produce accurate ionization energies, predicts ioniza¬ 
tion rates much closer to the reference calculation. The 
key property of the xc functional that allows accurate 
determination of decay rates from complex-scaled DFT 
therefore appears to be that it must yield accurate ioniza¬ 
tion potentials, which is linked to its ability to reproduce 
the correct asymptotic form of the potential at large dis¬ 
tances from the system nm. 



X [a.u.] 


FIG. 8. Potential (blue) and the real (solid) and imagi¬ 
nary (dotted) parts of the two bound (green) and three low¬ 
est resonant (red) wavefunctions. For improved visualization, 
the wavefunctions are vertically displaced by the real parts of 
their energies. 



Field strength [a.u.] 

FIG. 9. Ionization rates of the He atom in strong electric 
fields using the local density approximation (LDA) and exact 
exchange (EXX), compared to an accurate numerical refer¬ 
ence nm as well as the analytic ADK approximation [na. 
Results from Ref. 


IX. QUANTUM OPTIMAL CONTROL 
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FIG. 7. Spectrum of one-dimensional complex-scaled single¬ 
particle Hamiltonian with potential v{x) = 3(x^ — 2)e“^ 
and Q — 0.5. The lowest-energy resonance, here located close 
to the origin, does not lie exactly on the real axis but has an 
imaginary part of about —10“^. 
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In recent years, we have added to Octopus some of 
the key advancements of quantum optimal-control the¬ 
ory (QOCT) |118[ I119j . In this section, we will briefly 
summarize what this theory is about, overview the cur¬ 
rent status of its implementation, and describe some of 
the results that have been obtained with it until now. 

Quantum control can be loosely defined as the manip¬ 
ulation of physical processes at the quantum level. We 
are concerned here with the theoretical branch of this 
discipline, whose most general formulation is precisely 
QOCT. This is, in fact, a particular case of the general 
mathematical field of “optimal control”, which studies 
the optimization of dynamical processes in general. The 
first applications of optimal control in the quantum realm 
appeared in the 80s |12QH122] . and the field has rapidly 
evolved since then. Broadly speaking, QOCT attempts 
to answer the following question: given a quantum pro¬ 
cess governed by a Hamiltonian that depends on a set of 
parameters, what are the values of those parameters that 
maximize a given observable that depends on the behav- 
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ior of the system? In mathematical terms: let a set of 
parameters ,..., um = u determine the Hamiltonian 
of a system H[u^t], so that the evolution of the system 
also depends on the value taken by those parameters: 

= ( 77 ) 

1^(0)) = IV-o), (78) 


i.e. the solution of the Schrodinger equation determines 
a map u — 'ip[u]. Suppose we wish to optimize a func¬ 
tional of the system F = F[ip]. QOCT is about find¬ 
ing the extrema of G{u) = F['0[i4]]. Beyond this search, 
QOCT also studies topics such as the robustness of the 
optimal solutions for those parameters, the number of 
solutions, or the construction of suitable algorithms to 
compute them. 

Perhaps the most relevant result of QOCT is the equa¬ 
tion for the gradient of G, which allows use of the var¬ 
ious maximization algorithms available. For the simple 
formulation given above, this gradient is given by 


dG 

OUm 


(u) = 2 Im 




dH 

dUm 


[u,t] 



where y is the costate, an auxiliary wave function that is 
defined through the following equation of motion: 

, (80) 

This equation assumes, in order to keep this description 
short, that the target functional F depends on the state 
of the system only at the final time of the propagation 
T, i.e. it is a functional of ip{T). Note the presence of a 
boundary value equation at the final time of the propaga¬ 
tion, as opposed to the equation of motion for the “real” 
system ip, which naturally depends on an initial value 
condition at time zero. With these simple equations, we 
may already summarize what is needed from an imple¬ 
mentation point of view in order to perform basic QOCT 
calculations: 

The first step is the selection of the parameters u, that 
constitute the search space. Frequently, these parameters 
are simply the values that the control function (typically, 
the electric-field amplitude) takes at the time intervals 
that are used to discretize the propagation interval, i.e. it 
is a “real-time parametrization”. However, more sophis¬ 
ticated parametrizations allow fine-tuning of the search 
space, introducing constraints and penalties into the for¬ 
mulation. 

Then, one must choose an algorithm for maximizing 
multi-dimensional functions such as G. One possibility 
is the family of gradient-less algorithms, which only re¬ 
quire a procedure to compute the value of the function, 
and do not need the gradient. In this case, the previ¬ 
ous equations are obviously not needed. One only has 


to propagate the system forwards in time, which is what 
Octopus can do best. The value of the function G can 
then be computed from the evolution of obtained with 
this propagation, and fed into the optimization proce¬ 
dure. A few gradient-less algorithms are implemented in 
Octopus. 

The most efficient optimizations can be obtained if 
information about the gradient is employed. In that 
case, we can use standard schemes, such as the family of 
conjugate-gradient algorithms, or the Broyden-Fletcher- 
Goldfarb-Shanno (BFGS) quasi-Newton scheme - we use 
the implementation of these algorithms included in the 
GSL mathematical library |I23j . Some ad hoc algo¬ 
rithms, developed explicitly for QOGT, exist. These may 
in some circumstances be faster than the general pur¬ 
pose ones. Some of those are implemented in Octopus as 
well [T241IT261 . 


In order to compute the gradient, one must implement 
a backwards-propagation scheme for the costate, which 
does not differ from the ones used for the normal forwards 
propagation Note, however, that in some cases 

the backwards propagation does not have the exact same 
simple linear form than the forwards propagation, and 
may include inhomogeneous or non-linear terms. The 
final step is the computation of the gradient from the 
integral given in eq. (79). 

The formulation of QOGT we have just sketched out is 
quite generic; in our case the quantum systems are those 
that can be modeled with Octopus (periodic systems are 
not supported at the moment), and the handle that is 
used to control the system is a time-dependent electric 
field, such as the ones that can be used to model a laser 
pulse. The set of parameters {u}- define the shape of 
this electric field; for example, they can be the Fourier 
coefficients of the field amplitude. 

The usual formulation of QOCT assumes the linearity 
of quantum mechanics. However, the time-dependent KS 
equations are not linear, making both the theory and 
the numerics more complicated. We have extended the 
basic theory previously described to handle the TDDFT 
equations, and implemented the resulting equations in 
Octopus |I28j . 

We conclude this section by briefly describing some 
of the applications of the QOCT machinery included in 
Octopus, which can give an idea of the range of pos¬ 
sibilities that can be attempted. The study presented 
in Ref. |I29j demonstrates the control of single-electron 
states in a two-dimensional semiconductor quantum-ring 
model. The states whose transitions are manipulated are 
the current-carrying states, which can be populated or 
de-populated with the help of circularly polarized light. 

Reference ra studies double quantum dots, and 
shows how the electron state of these systems can be 
manipulated with the help of electric fields tailored by 
QOCT. 

Another interesting application is how to tailor the 
shape of femtosecond laser pulses in order to obtain max¬ 
imal ionization of atoms and molecules m- The system 












16 


chosen to demonstrate this possibility is the H J molecule, 
irradiated with short 5 fs) high-intensity laser pulses. 

The feasibility of using the electronic current to define 
the target functional of the QOCT formalism is consid¬ 
ered in Ref. |132j . 

Finally, a series of works has studied the use of op¬ 
timal control for photo-chemical control: the tailoring 
of laser pulses to create or break selected bonds in 
molecules. The underlying physical model should be 
based on TDDFT, and on a mixed quantum/classical 
scheme (within Octopus, Ehrenfest molecular dynam¬ 
ics). Some first attempts in this area were reported 
in Refs |133l I134j . However, these works did not con¬ 
sider a fully consistent optimal control theory encom¬ 
passing TDDFT and Ehrenfest dynamics. This theory 
has been recently presented |135j , and the first computa¬ 
tions demonstrating its feasibility will be reported soon. 


X. PLASMONICS 


The scope of real-space real-time approaches is not 
confined to the atomistic description of matter. Eor in¬ 
stance, finite-difference time-domain |136j (EDTD) is a 
standard numerical tool of computational electromag¬ 
netism, while lattice Boltzmann methods |137j (LBM) 
are widely used in computational ffuid dynamics. In¬ 
deed, real-space real-time approaches can be used to 
model physical processes on rather different space and 
time scales. This observation also bears an important 
suggestion: numerical methods based on real-space grids 
can be used to bridge between these different space and 
time scales. 

Numerical nanoplasmonics is a paradigmatic case for 
multiscale electronic-structure calculations. A nanoplas- 
monic system - e.^., made up of metal nanoparticles 
(MNPs) - can be a few tens of nanometers across, while 
the region of strong field enhancement - e.g., in the gap 
between two MNPs - can be less than 1 nm across |138j . 
The field enhancement, h (r), is essentially a classical ob¬ 
servable, defined as 


h{v) = 


/ 


(EL (r)) 
(EeL (r)) ’ 


(82) 


where Etot is the total electric field, Eext is the exter¬ 
nal (or driving) electric field, and (• • •) indicates a time 
average. Large field enhancements are the key to single 
molecule surface-enhanced Raman spectroscopy (SERS) 
and values as large as h > 100 (the intensity of the SERS 
signal scales as h^) are predicted by classical electromag¬ 
netic calculations |139j . 

In classical calculations, the electronic response is mod¬ 
eled by the macroscopic permittivity of the material. The 
classical Drude model gives the following simple and ro¬ 
bust approximation of the metal (complex) permittivity: 


■ (^) — ^oo 


UJ" 


uj {(jj iy) 


(83) 


Eor gold, typical values of the high-frequency permittiv¬ 
ity Coo, the plasma frequency cjp, and the relaxation rate 
7 , are: e^o = 9.5, fiuj = 8.95 eV and = 69.1 meV |140j . 
A non-local correction to the Drude model can also be in¬ 
cluded by considering the plasmon dispersion |141[ I142j . 
The metal (complex) permittivity then reads 

The parameter (3 can be fitted to model the experimen¬ 
tal data, although the value [3 = where vp 

is the Eermi velocity, is suggested by the Thomas-Eermi 
approximation. m 

Regardless of the level of sophistication of the permit¬ 
tivity model, all classical calculations assume that elec¬ 
trons are strictly confined inside the metal surfaces. This 
is a safe approximation for microscopic plasmonic struc¬ 
tures. However, at the nanoscale the electronic delocal¬ 
ization (or spillout) outside the metal surfaces becomes 
comparable to the smallest features of the plasmonic 
nanostructure, e.^., to the gap between two MNPs. In 
this scale, the very definition of a macroscopic permit¬ 
tivity is inappropriate and the electronic response must 
be obtained directly from the quantum dynamics of the 
electrons. 

TDDET is currently the method of choice to model the 
plasmonic response of MNPs |144I415Q] . via the simpli¬ 
fied jellium model, in which the nuclei and core electrons 
are described as a uniform positive charge density, and 
only the valence electrons are described explicitly. Early 
calculations - especially nanospheres [na usi] - have 
suggested the existence of new charge-transfer plasmonic 
modes, which have been also demonstrated by pioneering 
experiments |138j . In the future, as the field of quantum 
plasmonics |152j - z.e., the investigation and control of 
the quantum properties of plasmons - will further de¬ 
velop, the demand for accurate, yet scalable, numerical 
simulations to complement the experimental findings is 
expected to grow. This demand represents both a chal¬ 
lenge and an opportunity for computational physics. 

Scaling up the TDDET@jelhum method to model 
larger and more complex plasmonic nanostructures is a 
challenge which can be addressed by high-performance 
real-space real-time codes, like Octopus. The code has 
been initially applied to investigate the plasmonic re¬ 
sponse of single gold nanospheres (Wigner-Seitz radius, 
Ts = 3.0 bohr) |146j . A clear plasmonic resonance ap¬ 
pears in the absorption cross section - computed by real¬ 
time propagation - for spheres containing a large enough 
number of electrons > 100). A new plasmonic mode, 
deemed the “quantum core plasmon”, has been also sug¬ 
gested from the analysis of the absorption cross-section. 
This new mode has been further characterized by probing 
the sphere at its resonance frequency. Within a real-time 
propagation scheme, this is simply done by including an 
external electric field, the “laser pulse”, oscillating at a 
given frequency. 
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As versatility is a major strength of real-space real¬ 
time approaches, other jellium geometries can be easily 
modeled by Octopus, including periodic structures. For 
instance, a pair of interacting sodium nanowires (with 
periodicity along their longitudinal direction) has been 
investigated to assess the accuracy of classical methods 
based on the model permittivity in eq. ([8^ and eq. (84). 


Compared to pairs of nanospheres, nanowires display a 
stronger inductive interaction due to their extended ge¬ 
ometry [TTti es]. This is manifest in the absorption 
cross-section which already shows a large split of the 
plasmonic peak for a small gap between the wires (see 
Fig. [Tol^a)). Due to the electronic spillout and the sym¬ 
metry of the system, it also turns out that the largest field 
enhancement is reached at the center of the gap, not on 
the opposing surfaces of the nanowires as predicted by the 
classical methods (see Fig. [^b)). The maximum field 
enhancement estimated by the TDDFT@jellium method 
is also smaller than the classical estimates. Once again, 
the quantum delocalization ignored by the classical meth¬ 
ods plays a crucial role in “smearing” the singularities of 
the induced field, effectively curbing the local field en¬ 
hancement. 

Simple jellium geometries have been implemented in 
Octopus and they can be used as effective “superatomic 
pseudopotentials”. The similarity between the jellium 
potential and atomic pseudopotentials can be further ex¬ 
ploited to develop an external “jellium pseudopotential” 
generator to be used with Octopus. In this way, a larger 
selection of jellium geometries will be made available 
along with refined, yet scalable, jellium approaches to in¬ 
clude d electron screening in noble metals |153j . Efforts 
in this direction are being currently made. 

Finally, a word of caution about the domain of appli¬ 
cability of the TDDFT@jellium method is in order. The 
non-uniformity of the atomic lattice is expected to affect 
the absorption cross-section of small MNPs. A careful 
assessment of the lattice contributions - including the 
lattice symmetry - on the main plasmon modes of a pair 
of nanosphere is available This last investigation 

further demonstrates the possibility to bridge between 
atomistic and coarse-grained electronic calculations by 
means of a real-space real-time approach. 


XI. DEVELOPMENT OF EXCHANGE AND 
CORRELATION FUNCTIONALS 

The central quantity of the KS scheme of DFT is the 
xc energy Exc['^], which describes all non-trivial many- 
body effects. Clearly, the exact form of this quantity is 
unknown and it must be approximated in any practical 
application of DFT. We emphasize that the accuracy of 
any DFT calculation depends solely on the form of this 
quantity, as this is the only real approximation in DFT 
(neglecting numerical approximations that are normally 
controllable). 

During the past 50 years, hundreds of different forms 
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FIG. 10. Panel (a): Absorption cross section of a pair 
of sodium nanowires. The driving electric field is polar¬ 
ized as shown in the inset. Curves are for different val¬ 
ues of gap, d, between the nanowires, From top to bottom: 
d = 5, 2, 1, 0.5, 0.2, 0.1, 0 nm. Panel (b): Field enhance¬ 
ment, /i, for the case d = 0.5 nm. The black lines indicate the 
nanowire surfaces. (Adapted from Ref. [147] i 


have appeared m- They are usually arranged in fam¬ 
ilies, which have names such as generalized-gradient ap¬ 
proximations (GGAs), meta-GGAs, hybrid functionals, 
etc. In 2001, John Perdew came up with a beautiful 
idea on how to illustrate these families and their relation¬ 
ship |155j . He ordered the families as rungs in a ladder 
that leads to the heaven of “chemical accuracy”, which 
he christened the “Jacob’s ladder” of density-functional 
approximations for the xc energy. Every rung adds a 
dependency on another quantity, thereby increasing the 
precision of the functional but also increasing the numer¬ 
ical complexity and the computational cost. 

The first three rungs of this ladder are : (i) the local- 
density approximation (LDA), where the functional has a 
local dependence on the density only; (ii) the generalized- 
gradient approximation (GGA), which includes also a lo¬ 
cal dependence on the gradient of the density Vn(r); 
and (iii) the meta-GGA, which adds a local dependence 
on the Laplacian of the density and on the kinetic-energy 
density. In the fourth rung we have functionals that de¬ 
pend on the occupied KS orbitals, such as exact exchange 
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or hybrid functionals. Finally, the fifth rung adds a de¬ 
pendence on the virtual KS orbitals. 

Support for the first three rungs and for the local part 
of the hybrid functionals in Octopus is provided through 
the Libxc library |156j . Libxc started as a spin-off project 
during the initial development of Octopus. At that point, 
it became clear that the task of evaluating the xc func¬ 
tional was completely independent of the main structure 
of the code, and could therefore be transformed into a 
stand-alone library. Over the years, Libxc became more 
and more independent of Octopus, and is now used in a 
variety of DFT codes. There are currently more than 150 
xc functionals implemented in Libxc that are available in 
Octopus, a number that has been increasing steadily over 
the years. All of the standard functionals are included 
and many of the less common ones. There is also support 
for LDAs and GGAs of systems of reduced dimensional¬ 
ity (ID and 2D), which allow for direct comparisons with 
the direct solution of the many-body Schrodinger equa¬ 
tion for model systems described in section Eng 

Octopus also includes support for other functionals 
of the fourth rung, such as exact exchange or the 
self-interaction correction of Perdew and Zunger nm, 
through the solution of the optimized effective potential 
equation. This can be done exactly m, or within the 
Slater |159j or Krieger-Lee-Iafrate approximations [isni. 


Besides the functionals that are supported by Octopus, 
the code has served as a platform for the testing and de¬ 
velopment of new functionals. For example, the method 
described in section [XIIl| can be used in a straightforward 
way to obtain reference data against which to benchmark 
the performance of a given xc functional, for example a 
one-dimensional LDA m\- In that case, both calcula¬ 
tions, exact and approximate, make use of the same real- 
space grid approach, which makes the comparison of the 
results obtained with both straightforward. Despite the 
obvious advantage of using exact solutions of the many- 
body problem as reference data, this is often not possible 
and one usually needs to resort to the more commonly 
used experimental or highly-accurate quantum-chemistry 
data. In this case, the flexibility of the real-space method, 
allowing for the calculation of many different properties 
of a wide variety of systems, is again an advantage. Oc¬ 
topus has therefore been used to benchmark the perfor¬ 
mance of xc functionals whose potential has a correct 
asymptotic behavior |162j when calculating ionization 
potentials and static polarizabilities of atoms, molecules, 
and hydrogen chains. 


In this vein, Andrade and Aspuru-Guzik |163j pro¬ 
posed a method to obtain an asymptotically correct 
xc potential starting from any approximation. Their 
method is based on considering the xc potential as an 
electrostatic potential generated by a fictitious xc charge. 
In terms of this charge, the asymptotic condition is given 
as a simple formula that is local in real space and can 
be enforced by a simple procedure. The method, im¬ 
plemented in Octopus, was used to perform test calcu¬ 
lations in molecules. Additionally, with this correction 


procedure it is possible to find accurate predictions for 
the derivative discontinuity and, hence, predict the fun- 
damentai gap |164j . 


XII. REAL-SPACE REDUCED 
DENSITY-MATRIX FUNCTIONAL THEORY 


An aiternative approach to DFT that can modei eiec- 
trons using a singie-particie framework is reduced den¬ 
sity matrix functionai theory (RDMFT) |165j . Here, we 
present the current resuits of an ongoing effort to deveiop 
a reai-space version of RDMFT and to impiement it in 
the Octopus code. 

Within RDMFT, the totai energy of a system is given 
as a functionai of the one-body reduced density-matrix 
(1-RDM) 


^ I I 

(85) 

which can be written in its spectrai representation as 

CX) 

7 (r, r') = Y2 (86) 

where the naturai orbitais (j)i{r) and their occupation 
numbers rii are the eigenfunctions and eigenvaiues of the 
1-RDM, respectiveiy. 

In RDMFT the totai energy is given by 
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The third term is the Hartree energy, and the fourth 
the xc energy, F^xc- As in DFT, the exact functionai of 
RDMFT is unknown. However, the part that needs to 
be approximated, F’xcM, comes, contrary to DFT, oniy 
from the eiectron-eiectron interaction, as the interacting 
kinetic energy can be expiicitiy expressed in terms of 7 . 
Different approximate functionais are empioyed and min¬ 
imized with respect to the 1-RDM in order to find the 
ground state energy |166H168] . A common approxima¬ 
tion for F’xc is the Miiiier functionai [TM] . which has the 
form 
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and is the only E^c implemented in Octopus for the mo¬ 
ment. 

For closed-shell systems, the necessary and sufficient 
conditions for the 1-RDM to be A/'-representable [no], 
i.e. to correspond to a A^-electron wavefunction, is that 
0 < rii < 2 and 


E' 


N. 


(89) 


Minimization of the energy functional of eq. (87) is per¬ 


formed under the A/'-representability constraints and the 
orthonormality requirements of the natural orbitals, 


{(pi\(pj) = 6i^ 


(90) 


The bounds on the occupation numbers are automati¬ 
cally satisfied by setting rii = 2sin^(27n^ and varying 
i9i without constraints. The conditions (1^ and (1^ are 


taken into account via Lagrange multipliers /i and Xij , re¬ 
spectively. Then, one can define the following functional 


Q[N,{^i},{(j)i{r)}] = E - ^ \^^2sm\2n^i) - Nj 

OO 

which has to be stationary with respect to variations in 
{'di}, and {0*(r’)}. In any practical calculation 

the infinite sums have to be truncated including only a fi¬ 
nite number of occupation numbers and natural orbitals. 
However, since the occupation numbers rij decay very 
quickly for j > N, this is not problematic. 

The variation of Cl is done in two steps: for a fixed set of 
orbitals, the energy functional is minimized with respect 
to occupation numbers and, accordingly, for a fixed set 
of occupations the energy functional is minimized with 
respect to variations of the orbitals until overall conver¬ 
gence is achieved. As a starting point we use results from 
a Hartree-Fock calculation and first optimize the occu¬ 
pation numbers. Since the correct /i is not known, it is 
determined via bisection: for every /i the objective func¬ 
tional is minimized with respect tountil the condition 
(89) is satisfied. 


Due to the dependence on the occupation numbers, the 
natural-orbital minimization does not lead to an eigen¬ 
value equation like in DFT or Hartree-Fock. The im¬ 
plementation of the natural orbital minimization follows 
the method by Piris and Ugalde |171j . Varying Cl with 
respect to the orbitals for fixed occupation numbers one 
obtains 


Xji — Tli 
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(92) 

At the extremum, the matrix of the Lagrange multipliers 
must be Hermitian, i.e. 


Then one can define the off-diagonal elements of a Her¬ 
mitian matrix F as: 

Fji = dii - - X*j) + dij - i){X*j - Xji), (94) 

where 0 is the unit-step Heaviside function. We initialize 
the whole matrix as Fji = {\ji + A*^)/ 2 . In every itera¬ 
tion we diagonalize F, keeping the diagonal elements for 
the next iteration, while changing the off-diagonal ones to 
(94). At the solution all off-diagonal elements of this ma¬ 


trix vanish, hence, the matrices F and 7 can be brought 
simultaneously to a diagonal form. Thus, the {0^} which 
are the solutions of eq. (93) can be found by diagonal- 


Aii - A*,. = 0 . 


(93) 


ization of F in an iterative manner HZT]. The criterion 
to exit the natural-orbital optimization is that the dif¬ 
ference in the total energies calculated in two successive 
F diagonalizations is smaller than a threshold. Overall 
convergence is achieved when the difference in the total 
energies in two successive occupation-number optimiza¬ 
tions and the non-diagonal matrix elements of F are close 
to zero. 

As mentioned above, one needs an initial guess for the 
natural orbitals both for the first step of occupation- 
number optimization but also for the optimization of 
the natural orbitals. A rather obvious choice would be 
the occupied and a few unoccupied orbitals resulting 
from a DFT or HF calculation. Unfortunately, there are 
unbound states among the HF/DFT unoccupied states 
which are a bad starting point for the weakly occupied 
natural orbitals. When calculated in a finite grid these 
orbitals are essentially the eigenstates of a particle in a 
box. Using the exact-exchange approximation (EXX) in 
an optimized-effective-potential framework results in a 
larger number of bound states than HF or the local den¬ 
sity approximation (LDA) due to the EXX functional be¬ 
ing self-interaction-free for both occupied and unoccupied 
orbitals. Using HE or LDA orbitals to start a RDMET 
calculation, the natural orbitals do not converge to any 
reasonable shape, but even when starting from EXX one 
needs to further localize the unoccupied states. Thus, we 
have found that in order to improve the starting point for 
our calculation we can multiply each unoccupied orbital 
by a set of Gaussian functions centered at the positions 
of the atoms. As the unbound states are initially more 
delocalized than the bound ones, we choose a larger ex¬ 
ponent for them. 

In Eig. we show the dissociation curve of H 2 ob¬ 
tained with RDMET in Octopus and compare it with 
results obtained by the Gaussian-basis-set RDMET code 
HIPPO |I72j . Eor the Octopus calculation, we kept 13 
natural orbitals with the smallest occupation number be¬ 
ing of the order of 10“^ after the RDMET calculation had 
converged. The HIPPO calculation was performed using 
30 natural orbitals. The RDMET curve obtained with 
Octopus looks similar to the one from HIPPO and other 
Gaussian implementations of RDMET |I 66 j , keeping the 
nice feature of not diverging strongly in the dissociation 
limit. However, for internuclear distances R bigger than 
I a.u., the real-space energy lies above the HIPPO one. 
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We believe that the remaining difference can be removed 
by further improving the initial guess for the orbitals that 
we use in Octopus, because a trial calculation using HF 
orbitals from a Gaussian implementation showed a curve 
almost identical to the one from the HIPPO code (not 
shown in the figure). In the future, we plan to include 
support for open-shell systems and additional xc func¬ 
tionals. 



FIG. 11. Dissociation curve of the hydrogen molecule. Re¬ 
stricted Hartree-Fock (black dotted and red dash-dotted lines) 
does not dissociate into two neutral atoms while the closed- 
shell RDMFT gives almost the correct energy of -1 Ha at the 
dissociation limit in a Gaussian implementation. For the grid 
implementation in Octopus, a deviation from the constant 
energy at large R remains. 


XIII. EXACT SOLUTION OF THE 
MANY-BODY SCHRODINGER EQUATION FOR 
FEW ELECTRONS 

In one-dimensional systems, the fully interacting 
Hamiltonian for N electrons has the form 

H ^ ^ I — 2 “l“'^ext(^j) I ff ^ ^ '^int {^j ; ^fe) ; (9^) 

i = l \ ) j<k 

where the interaction potential Vi^t{xj,Xk) is usually 
Coulombic, though the following discussion also applies 
for other types of interaction, including more than two- 
body ones. In ID one ofte n uses the soft Coulomb inter¬ 
action 1 /+ l, where a softening parameter 
(usually set to one) is introduced in order to avoid the 
divergence at Xj = which is non-integrable in ID. 

Mathematically, the Hamiltonian (eq. ([9^) is equiv¬ 
alent to that of a single (and hence truly independent) 
electron in N dimensions, with the external potential 

N N 

V^J{xi...Xn) = ^ Vext(a:j) + '^Vint(Xj,Xk). (96) 

j=l j<k 


For small N it is numerically feasible to solve the N- 
dimensional Schrodinger equation 

H^j{xi...XN) = Ej^j{xi...XN) (97) 


which provides a spatial wave function for a single par¬ 
ticle in N dimensions. This equivalence is not restricted 
to one-dimensional problems. One can generally map a 
problem of N electrons in d dimensions onto the problem 
of a single particle in Nd dimensions, or indeed a prob¬ 
lem with multiple types of particles {e.g. electrons and 
protons) in d dimensions, in the same way. 

What we exploit in Octopus is the basic machinery for 
solving the Schodinger equation in an arbitrary dimen¬ 
sion, the spatial/grid bookkeeping, the ability to repre¬ 
sent an arbitrary external potential, and the intrinsic par¬ 
allelization. In order to keep our notation relatively sim¬ 
ple, we will continue to discuss the case of an originally 
one-dimensional problem with N electrons. Grid-based 
solutions of the full Schrodinger equation are not new, 
and have been performed for many problems with either 
few electrons (in particular H 2 , D 2 and HJ) |173| I174j ) 
or model interactions |175j . including time-dependent 
cases |176j . 


The time-dependent propagation of the Schrodinger 
equation can be carried out in the same spirit, since the 
Hamiltonian is given explicitly and each “single-particle 
orbital” represents a full state of the system. A laser or 
electric-field perturbation can also be applied, depending 
on the charge of each particle (given in the input), and 
taking care to apply the same effective field to each par¬ 
ticle along the polarization direction of the field (in ID, 
the diagonal of the hyper-cube). 

Solving eq. (97) leaves the problem of constructing a 
wave function which satisfies the anti-symmetry prop¬ 
erties of N electrons in one dimension. For fermions 
one needs to ensure that those spatial wave functions 
which are not the spatial part of a properly anti¬ 
symmetric wave function are removed as allowed solu¬ 
tions for the A-electron problem. A graphical represen¬ 
tation of which wave functions are allowed is given by 
the Young diagrams (or tableaux) for permutation sym¬ 
metries, where each electron is assigned a box, and the 
boxes are then stacked in columns and rows (for details 
see, for example. Ref. |177j b Each box is labeled with 
a number from 1 to A such that the numbers increase 
from top to bottom and left to right. 


All possible decorated Young diagrams for three and 
four electrons are shown in Fig. Since there are two 
different spin states for electrons, our Young diagrams 
for the allowed spatial wave functions contain at most 
two columns. The diagram d) is not allowed for the wave 
function of three particles with spin 1/2, and diagrams 
k) to n) are not allowed for four particles. To connect a 
given wave function with a diagram one has to sym¬ 
metrize the wave function according to the diagram. For 
example, for diagram b) one would perform the following 
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operations on a function ^(xi,X 2 ,X 3 ) 


[^{xi,X2,Xs) + ^{x2,Xi,Xs)] 
- [^{x3,X2,Xi) 


^{X3,XI,X2)] . (98) 


Hence, one symmetrizes with respect to an interchange 
of the first two variables, because they appear in the 
same row of the Young diagram, and anti-symmetrizes 
with respect to the first and third variable, as they ap¬ 
pear on the same column. We note that we are referring 
to the position of the variable in the list, not the in¬ 
dex, and that symmetrization always comes before anti- 
symmetrization. At the end of these operations one cal¬ 
culates the norm of the resulting wave function. If it 
passes a certain threshold, by default set to 10“^, one 
keeps the obtained function as a proper fermionic spatial 
part. If the norm is below the threshold, one contin¬ 
ues with the next allowed diagram until either a norm 
larger than the threshold is found or all diagrams are 
used up. If a solution does not yield a norm above 
the threshold for any diagram it is removed since it cor¬ 
responds to a wave function with only bosonic or other 
non-fermionic character. Generally, as the number of for¬ 
bidden diagrams increases with Y, the number of wave 
functions that need to be removed also increases quickly 
with Y, in particular in the lowest part of the spectrum. 
The case of two electrons is specific, as all solutions of 
eq. (97) correspond to allowed fermionic wave functions: 


the symmetric ones to the singlet states and the anti¬ 
symmetric ones to the triplet states. 


a) 


b) 


1 

2 

c) 
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FIG. 12. Young diagrams for three [a)-d)] and four [e)-n)] 
electrons. For three electrons, only diagrams a)-c) are allowed 
for spin-1/2 particles, while only diagrams e)-j) are allowed 
for four electrons. 


For example, for a one-dimensional Li atom with an 
external potential 

'^ext(^) =- ! f ^ (99) 


\fx^ 


1 


State 

Energy Young diagram 

Norm 

1 

-4.721 

bosonic 

< 10“^^ 

2 

-4.211 

b) 

0.2 

3 

-4.211 

c) 

0.6 

4 

-4.086 

bosonic 

< 10“^^ 

5 

-4.052 

b) 

0.4 

6 

-4.052 

c) 

0.7 


TABLE III. Eigenstates for a one-dimensional lithium atom. 
The first and the fourth eigenstates show norms that are 
smaller than 10“^^ and 10“^^, respectively, for all diagrams. 
Hence, these states are bosonic and removed from any further 
calculations. The second and third states are energetically de¬ 
generate and correspond to diagrams b) and c) in Eig. 12 The 
same is true for the fifth and sixth states. 


and the soft Goulomb interaction, we obtain the states 
and energy eigenvalues given in table [Hlj 

If certain state energies are degenerate, the Young dia¬ 
gram “projection” contains an additional loop, ensuring 
that the same diagram is not used to symmetrize suc¬ 
cessive states: this would yield the same spatial part for 
each wave function in the degenerate sub-space. A given 
diagram is only used once in the sub-space, on the first 
state whose projection has significant weight. 

The implementation also allows for the treatment of 
bosons, in which case the total wave function has to be 
symmetric under exchange of two particles. Here one 
will use a spin part symmetrized with the same Young 
diagram (instead of the mirror one for fermions), such 
that the total wave function becomes symmetric. 

In order for the (anti-)symmetrization to work properly 
one needs to declare each particle in the calculation to 
be a fermion, a boson, or an anyon. In the latter case, 
the corresponding spatial variables are not considered at 
all in the (anti-)symmetrization procedure. One can also 
have more than one type of fermion or boson, in which 
case the symmetric requirements are only enforced for 
particles belonging to the same type. 

There are also numerical constraints on the wave- 
functions: space must be represented in a homoge¬ 
neous hyper-cube, eventually allowing for different parti¬ 
cle masses by modifying the kinetic-energy operator for 
the corresponding directions. All of the grid-partitioning 
algorithms intrinsic to octopus carry over to arbitrary di¬ 
mensions, which allows for immediate parallelization of 
the calculations of the ground and excited states. The 
code can run with an arbitrary number of dimensions, 
however, the complexity and memory size grow expo¬ 
nentially with the number of particles simulated, as ex¬ 
pected. Production runs have been executed up to 6 or 
7 dimensions. 

Most of the additional treatment for many-body quan¬ 
tities is actually post-processing of the wave-functions. 
For each state, the determination of the fermionic or 
bosonic nature by Young-tableau symmetrization is fol¬ 
lowed by the calculation and output of the density for 
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each given particle type, if several are present. Other 
properties of the many-body wave-function can also be 
calculated. For example, Octopus can also output the 
one-body density matrix, provided in terms of its occu¬ 
pation numbers and natural orbitals. 

This type of studies, even when they are limited to 
model systems of a few electrons, allows us to produce 
results that can be compared to lower levels of the¬ 
ory like approximate DFT or RDMFT, and to develop 
better approximations for the exchange and correlation 
term. Exact results obtained from such calculations 
have been used to assess the quality of a ID LDA func¬ 
tional HM] and adiabatic ID LDA and exact exchange in 
a TDDFT calculation calculation of photoemission spec¬ 
tra pmi[T78] . 


XIV. COMPRESSED SENSING AND 
ATOMISTIC SIMULATIONS 


In order to obtain frequency-resolved quantities from 
real-time methods like molecular dynamics or electron 
dynamics, it is necessary to perform a spectral represen¬ 
tation of the time-resolved signal. This is a standard op¬ 
eration that is usually performed using a discrete Fourier 
transform. Since the resolution of the spectrum is given 
by the length of the time signal, it is interesting to look 
for more methods that can provide us a spectrum of sim¬ 
ilar quality with shorter time series, as this is directly re¬ 
flected in shorter computation times. Several such meth¬ 
ods exist, but a particular one that has been explored in 
Octopus, due to its general applicability and efficiency, 
is compressed sensing. 

Compressed sensing is a general theory aimed at 
optimizing the amount of sampling required to recon¬ 
struct a signal. It is based on the idea of sparsity, a mea¬ 
sure of how many zero coefficients a signal has when rep¬ 
resented in a certain basis. Compressed sensing has been 
applied to many problems in experimental sciences pm 
1182] and technology |183[ 1184] in order to perform more 
accurate measurements. Its ideas, however, can also be 
applied to computational work. 

In order to calculate a spectrum in compressed sensing, 
we need to solve the so-called basis-pursuit optimization 
problem 

min|cr|i subject to Fcr = r , (100) 

<y 


where |cr|i = \ak\ is the standard 1-norm, r is the 
discretized time series, a is the frequency-resolved func¬ 
tion (the spectrum that we want to calculate) and F is 
the Fourier-transform matrix. 

Since r is a short signal, its dimension is smaller than 
the one of cr. This implies that the linear equation Fcr = 
T is under-determined and has many solutions, in this 
particular case, all the spectra that are compatible with 
our short tim e pro pagation. From all of these possible 
solutions, eq. (100) takes the one that has the smallest 1- 
norm, that corresponds to the solution that has the most 


zero coefficients. For spectra, this means we are choosing 
the one with the fewest frequencies, which will tend to 
be the physical one, as for many cases we know that the 
spectra is com posed of a discrete number of frequencies. 

To solve eq. (100) numerically, we have implemented in 

The solution typ- 


Octopus the SPGLl algorithm 
ically takes a few minutes, which is two orders of mag¬ 
nitude more expensive than the standard Fourier trans¬ 
form, but this is negligible in comparison with the cost 
of the time propagation. 

By applying compressed sensing to the determination 
of absorptional or vibrational spectra, it was found that 
a time signal a fifth of the length can be used in com¬ 
parison with the standard Fourier transform [35]. This 
is translated into an impressive factor-of-five reduction 
in the computational time. This is illustrated in Fig. 
where we show a spectrum calculated with compressed 
sensing from a 10 fs propagation, which has a resolution 
similar to a Fourier transform spectrum obtained with 50 
fs of propagation time. 

Moreover, the general conclusion that can be obtained 
from this work is that in the application of compressed 
sensing to simulations the reduction in the number of 
samples that compressed sensing produces in an experi¬ 
mental setup is translated into a reduction of the com¬ 
putational time. This concept inspired studies on how 
to carry the ideas of compressed sensing into the core 
of electronic-structure simulations. The first result of 
this effort is a method to use compressed sensing to re¬ 
construct sparse matrices, that has direct application in 
the calculation of the Hessian matrix and vibrationa l fre - 
quencies from linear response (as discussed in section III). 
For this case, our method results in the reduction of the 
computational time by a factor of three |186] . 



FIG. 13. Optical absorption spectrum from a methane 
molecule from real-time TDDFT. Comparison of the calcu¬ 
lation using a Fourier transform and a propagation time of 
50 fs (top, black curve) with compressed sensing and a prop¬ 
agation time of 10 fs (bottom, blue curve). Compressed sens¬ 
ing produces a similar resolution, with a propagation 5 times 
shorter. 
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XV. PARALLELIZATION, OPTIMIZATIONS 
AND GRAPHICS PROCESSING UNITS 


Laplacian operator running on a GPU by using a Hilbert 
curve to map the grid into memory. 


Computational cost has been and still is a fundamental 
factor in the development of electronic structure meth¬ 
ods, as the small spatial dimensions and the fast move¬ 
ment of electrons severely limit the size of systems that 
can be simulated. In order to study systems of inter¬ 
est as realistically and accurately as possible, electronic- 
structure codes must execute efficiently in modern com¬ 
putational platforms. This implies support for massively 
parallel platforms and modern parallel processors, includ¬ 
ing graphics processing units (GPUs). 

Octopus has been shown to perform efficiently on par¬ 
allel supercomputers, scaling to hundreds of thousands 
of cores pmiwi- The code also has an implementation 
of GPU acceleration [35l I188j that has shown to be com¬ 
petitive in performance with Gaussian DFT running on 
GPUs [US]. 


Performance is not only important for established 
methods, but also for the implementation of new ideas. 
The simplicity of real-space grids allows us to provide Oc¬ 
topus developers with building blocks that they can use 
to produce highly efficient code without needing to know 
the details of the implementation, isolating them as much 
as possible from the optimization and parallelization re¬ 
quirements. In most cases, these building blocks allow 
developers to write code that is automatically parallel, 
efficient, and that can transparently run on GPUs. The 
type of operations available run from simple ones, like 
integration, linear algebra, and differential operators, to 
more sophisticated ones, like the application of a Hamil¬ 
tonian or solvers for differential equations. 

However, it is critical to expose an interface with the 
adequate level that hides the performance details, while 
still giving enough flexibility to the developers. For ex¬ 
ample, we have found that the traditional picture of a 
state as the basic object is not adequate for optimal 
performance, as it does not expose enough data paral¬ 
lelism |188j . In Octopus we use a higher-level interface 
where the basic object is a group of several states. 

In the case of functions represented on the grid, the 
developers work with a linear array that contains the 
values of the field for each grid point. Additional data 
structures provide information about the grid structure. 
This level of abstraction makes it simple for developers 
to write code that works for different problem dimension¬ 
ality, and different kinds and shapes of grids. 

In terms of performance, by hiding the structure of 
the grid, we can use sophisticated methods to control 
how the grid points are stored in memory with the ob¬ 
jective of using processor caches more efficiently in finite- 
difference operators. We have found that by using space¬ 
filling curves as shown in Fig.jlffi and in particular 

the Hilbert curve |191[ I192j , we can produce a significant 
improvement in the performance of semi-iocai operations. 
For exampie, in Fig{^ shows that a performance gain 
of around 50% can be obtained for the finite-difference 



FIG. 14. Examples of different mappings from a 2D grid 
to a linear array: (a) standard map, (b) grid mapped by 
small parallelepipedic subgrids, and (c) mapping given by a 
Hilbert space-filling curve. These last two mappings provide 
a much better memory locality for semi-local operations than 
the standard approach. 



FIG. 15. Numerical performance of the Octopus finite- 
difference Laplacian implementation using different grid map¬ 
pings. Spherical grid with 500,000 points. Gomputations with 
a AMD Radeon 7970 GPU. A speed up of around 50% is ob¬ 
served for the subgrid and Hilbert curve mappings. 


Parallelization in Octopus is performed on different 
levels. The most basic one is domain decomposition, were 
the grid is divided in different regions that are assigned to 
each processor. For most operations, only the boundaries 
of the regions need to be communicated among proces¬ 
sors. Since the grid can have a complicated shape dic¬ 
tated by the shape of the molecule, it is far from trivial to 
distribute the grid-points among processors. For this task 
we use a third-party library called ParMETIS |193j . 
This library provides routines to partition the grid en¬ 
suring a balance of points and minimizing the size of the 
boundary regions, and hence the communication costs. 
An example of grid partitioning is shown in Fig. 

Additional parallelization is provided by other data 
decomposition approaches that are combined with do¬ 
main decomposition. This includes parallelization over 
KS states, and over /c-points and spin. The latter paral¬ 
lelization strategy is quite efficient, since for each /c-point 
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or spin component the operations are independent. How¬ 
ever, it is limited by the size of the system, and often is 
not available (as in the case of closed-shell molecules, for 
example). 

The efficiency of the parallelization over KS states de¬ 
pends on the type of calculation being performed. For 
ground state calculations, the orthogonalization and sub¬ 
space diagonalization routines BMl require the commu¬ 
nication of states. In Octopus this is handled by parallel 
dense linear-algebra operations provided by the ScaLA- 
PACK library |195j . For real-time propagation, on the 
other hand, the orthogonalization is preserved by the 
propagation [34] and there is no need to communicate 
KS states between processors. This makes real-time 
TDDFT extremely efficient in massively parallel comput¬ 
ers [35111^. 

An operation that needs special care in parallel is the 
solution of the Poisson equation. Otherwise, it consti¬ 
tutes a bottleneck in parallelization, as a single Poisson 
solution is required independently of the number of states 
in the system. A considerable effort has been devoted to 
the problem of finding efficient parallel Poisson solvers 
that can keep up with the rest of the code mi- We have 
found that the most efficient methods are based on FFTs, 
which require a different domain decomposition to per¬ 
form efficiently. This introduces the additional problem 
of transferring the data between the two different data 
partitions. In Octopus this was overcome by creating a 
mapping at the initialization stage and using it during 
execution to efficiently communicate only the data that 
is strictly necessary between processors m- 



FIG. 16. Example of adaptive mesh partitioning for a 
molecule of chlorophyll a. Simplified mesh with a spacing of 
0.5 A and a radius of 2.5 A. Each color represents a domain, 
which will be distributed to a set of processors for parallel 
computation. 


XVI. CONCLUSIONS 

In this article, we have shown several recent develop¬ 
ments in the realm of electronic-structure theory that 
have been based on the Octopus real-space code and 


made possible in part by the flexibility and simplicity 
of working with real-space grids. Most of them go be¬ 
yond a mere implementation of existing theory and rep¬ 
resent new ideas in their respective areas. We expect 
that many of these approaches will become part of the 
standard tools of physicists, chemists and material sci¬ 
entists, and in the future will be integrated into other 
electronic-structure codes. 

These advances also illustrate the variety of applica¬ 
tions of real-space electronic structure, many of which 
going beyond the traditional calculation schemes used in 
electronic structure, and might provide a way forward to 
tackle current and future challenges in the field. 

What we have presented also shows some of the cur¬ 
rent challenges in real-space electronic structure. One 
example is the use of pseudo-potentials or other forms of 
projectors to represent the electron-ion interaction. Non¬ 
local potentials introduce additional complications on 
both the formulation, as shown by the case of magnetic 
response, and the implementation. Pseudo-potentials 
also include an additional, and in some cases, not well- 
controlled approximation. It would be interesting to 
study the possibility of developing an efficient method 
to perform full-potential calculations without additional 
computational cost, for example by using adaptive or ra¬ 
dial grids. 

Another challenge for real-space approaches is the cost 
of the calculation of two-body Coulomb integrals that ap¬ 
pear in electron-hole linear response, RDMFT or hybrid 
xc functionals. In real-space these integrals are calcu¬ 
lated in linear or quasi-linear time by considering them 
as a Poisson problem. However, the actual numerical cost 
can be quite large when compared with other operations. 
A fast approach to compute these integrals, perhaps by 
using an auxiliary basis, would certainly make the real- 
space approach more competitive for some applications. 

The scalability of real-space grid methods makes them 
a good candidate for electronic-structure simulations in 
the future exafiop supercomputing systems expected for 
the end of the decade. In this aspect, the challenge is to 
develop high-performance implementations that can run 
efficiently on these machines. 
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